The history of the CATH structural classification of protein domains
نویسندگان
چکیده
This article presents a historical review of the protein structure classification database CATH. Together with the SCOP database, CATH remains comprehensive and reasonably up-to-date with the now more than 100,000 protein structures in the PDB. We review the expansion of the CATH and SCOP resources to capture predicted domain structures in the genome sequence data and to provide information on the likely functions of proteins mediated by their constituent domains. The establishment of comprehensive function annotation resources has also meant that domain families can be functionally annotated allowing insights into functional divergence and evolution within protein families.
منابع مشابه
The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution
We report the latest release (version 3.0) of the CATH protein domain database (http://www.cathdb.info). There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increases in diverse structural homologues being determined by the structural genomics initia...
متن کاملCATH: an expanded resource to predict protein function through structure and sequence
The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the number of predicted protein domains in the previous version. The daily-updated CATH-B, which contains...
متن کاملThe CATH classification revisited—architectures reviewed and new ways to characterize structural divergence in superfamilies
The latest version of CATH (class, architecture, topology, homology) (version 3.2), released in July 2008 (http://www.cathdb.info), contains 114,215 domains, 2178 Homologous superfamilies and 1110 fold groups. We have assigned 20,330 new domains, 87 new homologous superfamilies and 26 new folds since CATH release version 3.1. A total of 28,064 new domains have been assigned since our NAR 2007 d...
متن کاملAssigning genomic sequences to CATH
We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl. ac.uk/bsm/cath ). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homo-logous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered i...
متن کاملExtending CATH: increasing coverage of the protein structure universe and linking structure with function
CATH version 3.3 (class, architecture, topology, homology) contains 128,688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version...
متن کامل